Choosing a basis that eliminates spurious solutions in k • p theory 
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A small change of basis in k ■ p theory yields a Kane-like Hamiltonian for the conduction and 
valence bands of narrow-gap semiconductors that has no spurious solutions, yet provides an accurate 
fit to all effective masses. The theory is shown to work in superlattices by direct comparison with 
first-principles density-functional calculations of the valence subband structure. A reinterpretation 
of the standard data-fitting procedures used in k ■ p theory is also proposed. 
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I. INTRODUCTION 

The Kane model for coupled conduction and valence 
electrons in narrow-gap bulk semiconductors^^ was first 
applied to superlattices three decades ago4 Today this 
model is still used frequently for the study of medium- 
and narrow-gap nanostructures.— i&L &V ! 11 ! 12 ! 13 Kane's 
theory has a notorious pitfall: spurious solutions with 
large crystal momentum k, which arise from small Hamil- 
tonian matrix elements of order k 2 ,I£i£ 1 ] i>iiLI&i2i2ii2Ii22 

23.24.25.26.27.28.29.30.31.32.33.34,35.36.37.38.39.40.41.42.43.44.45.46 

Z&^ i 11 ! 12 ] 13 ! 47 Spurious propagating waves pose a seri- 
ous problem, since their presence within the energy gap 
changes the physical character of the model system from 
semiconducting to metallic. 

Many schemes for eliminating the unphysical effects of 
spurious solutions have been proposed (e.g., changing or 
adding parameters in the Hamiltonian, or excising the 
offending modes numerically or analytically), but none 
has yet found wide acceptance. The relative merits of 
the various proposals are not discussed here. Instead, it 
is merely noted that all of these schemes take the form 
of patches applied to Kane's original k • p theory. The 
possibility of reconstructing k • p theory on a different 
foundation has not been considered. 

This paper derives from first principles an 8 x 8 k • p 
Hamiltonian with no spurious solutions. The key step is a 
slight change in the standard choice of basis. This allows 
the adjustment-of-parameters method of Ref. 1321 which 
was proposed only as a useful approximation, to be for- 
mulated rigorously. The present derivation proves that — 
within the limitations imposed by a second-order differ- 
ential equation — this method is not an approximation. 
That is, all terms of order k 2 derived from a clearly de- 
fined basis can be included without approximation. (The 
number of fitting parameters can be reduced with a few 
standard approximations ; 2 ' 32 ' 48 but that is not a funda- 
mental limitation of the method.) The change of basis is 
applied here to the first-principles envelope-function the- 
ory developed in Refs. 49, 50, and 51. A comparison with 
density-functional calculations on Ino.53Gao.47As/InP su- 
perlattices shows very good agreement. 

In conventional k • p perturbation theory , 52 ' 53 one uses 
a unitary transformation to construct a basis in which 
the k • p coupling between the states of interest (set A) 



and all other states (set B) is reduced to zero, while si- 
multaneously renormalizing the masses in A and B. If A 
includes the highest valence and lowest conduction states, 
the k ■ p coupling within A is either set to zero (in single- 
band effective-mass theory 3 -^) or not modified at all (in 
the multiband Kane theory 2 ^ 3 -^ 3 -). 

In the present approach, a unitary transformation is 
used to modify the conduction-valence k • p interaction 
by only a small amount. The coupling can be cither 
strengthened or weakened; its actual value is fixed (in 
one of several possible choices) by setting the partially 
renormalized conduction-band mass to zero. This is pre- 
cisely the method used to eliminate spurious solutions 
in Ref. [32j. However, the interface operator ordering de- 
rived here is more subtle than the simple heuristic model 
of Ref. The present theory also suggests the need for 
a reinterpretation of the standard data-fitting procedures 
used in k ■ p models. 

The situation encountered here is analogous to a gauge 
transformation in quantum electrodynamics. Although 
all gauges are equivalent in exact calculations, differ- 
ent gauges may yield different predictions in approxi- 
mate calculations^ Likewise, the unitary transformation 
defined here would have no effect in an exact calcula- 
tion, but in a second-order k • p Hamiltonian of finite 
dimension, varying the parameters of the unitary trans- 
formation generates a metal-insulator phase transition in 
the model system. The remedy proposed here is simply 
to choose transformation parameters that lie within the 
physical (i.e., insulating) regime of the phase diagram. 

The paper begins in Sec. [TTJ with the definition and 
application of the unitary transformation to bulk semi- 
conductors. The theory is extended to heterostructures 
in Sec. IIHI and applied to the widely used Pidgeon-Brown 
Hamiltonian 48 in Sec. IIVI Numerical applications of the 
theory are presented in Sec. [V] Finally, the results of the 
paper are summarized and discussed in Sec. IVI1 



II. BULK CRYSTALS 

A. Hamiltonian 

Consider first the case of a bulk semiconductor. It is 
assumed at the outset that a Luttinger-Kohn (LK) uni- 
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tary transformatio n 52 ! 53 has already been used to elim- 
inate the k • p coupling between sets A and B. Thus, 
the effective Hamiltonian H for states in A is (in the LK 
basis) 



(nk|ff|nV> =H nn ,(k)6 kW 
H nn >(k) = E n 5 nn > + k t ir l nn , + kikjD l nn , 



(la) 
(lb) 



in which E n is the energy of state n at k = 0, ir nn , is the 
i component of the kinetic momentum matrix, and D*jL, 
is the inverse effective-mass tensor (in atomic units) 



D lJ , 

nn' 



1 V- ( Kj4n 

2VV u n i 



-il " lr 



(2) 



In this paper, the modified Hamiltonian ([3]) is derived 
by applying a unitary transformation e s to the original 
Hamiltonian (JlJ: 



H = e~ s He s = H 



[H,S} + -[[H,S},S] 



(6) 



where S = —S^ has matrix elements only within set A. 
The generator S is defined by 



(nk|S|n'k / )=5 nn /(k)5 kk 

&nn'Q*-) = ^iS m 



in which the linear coefficient is 



<? 1 , = 

^nn 



nn' 



(7a) 
(7b) 



(8) 



where tu n i = E n — Ei . Here and below all equations are 
written (for simplicity) as if the potential energy were 
local, although a nonlocal pseudopotential was used for 
the numerical calculations in Sec. [V] The term a^ n , is 
a matrix element of the Pauli spin operator, which ac- 
counts for the intrinsic magnetic dipole moment of the 
electron.— 

In Ref. HH, it was assumed to be permissible to treat 

^ nn' ^* a ^ 

adjustable parameter in Eq. |T]). In this ap- 
proach, ir nn , is replaced by ir nn , = n nnl +Aitl in ,, in which 
Air nn , has the same symmetry as ir nn , and vanishes when 
E n = E n i , but is otherwise arbitrary. The Hamiltonian 
(|lbp is then replaced by 



Hnn'O*) = E n 8 ml f + ki1T l n 



(3) 



in which the matrix D% , is adjusted to maintain agree- 
ment with all experimental effective masses. This con- 
straint does not, however, completely determine D^ n ,. 

To see this, consider applying Eq. ([2]) separately to set 
A and to the subset A nn < C A defined by A nn ' — {|"-"k) | 
mm(E n , E n i) < E n u < max(_E„, E n >)}. A comparison of 
the results for A and Ann' gives 



If S': 



0, the change AD 



nn' 



nn 



AD 13 , 

nn ! 



s-^i'AiriM 



i 



'nl'Un' 



Un'l 



(9a) 



in which TT nn , = n l nn , + ^ATr nn ,. Note that if we choose 

A Kn' = ~<n' ( for E n + E n >), then 7f* n , = and 
Eq. (f9"aj) just adds extra terms to the summation in @. 
Thus, if set A comprises the highest valence and low- 
est conduction states, one-band effective-mass theory is 
given by Air nn , — —w nn ,, while the Kane model is given 
by Air nn , = 0. (See AppendixfAlfor an alternative matrix 
formulation of this result.) 

Equation (|9"a)) can be rewritten as 



2 , \Wnl Wn'lJ 



+ 



-E 



Unl Wn'l , 

a< A - KA<' 

UnlWn'l 



, (9b) 



D t3 (A A = D tJ 



■Ann' 

E 

i 



l nl n ln' 
UJnl 



il n lr, 



(4) 



where D^ n , = D^ n , (A) and Ann' = A \ ■Am' is the com- 
plement of A nn ' in A. When E n = E n i , D^ n , (A nn ' ) is 
an experimentally measurable effective-mass parameter 
for the subspace A nn ' ■ 

If 7r nn , is treated as an adjustable parameter (n l nn , — * 
^nn') anc ^ ^nn'^n') ^ s ass umed to be independent of 
{A7r^ n ,}, then D% n , must satisfy 



D lJ , 

nn' 



DZ'i^nn') 



E 



l nl n ln' 
Unl 



L nl n ln' 



(5) 



However, in general it is only necessary for Eq. (|5|) to be 
satisfied when E„ = E„i . This still leaves some freedom 



of choice in the definition of D l ^ n , 



which shows that Eq. |5| is satisfied when E n — E n > , but 
not (in general) when E n ^ E n > ■ However, the degree of 
freedom corresponding to the coefficient S^ n , in Eq. (T7]) 
has not yet been used. Let 



8D ij , 

qij nn' 

°,i,-,' — 



(10) 



in which 8D^ n , has the same symmetry as D^ n , and van 



ishes when E„ 



E,, 



but is otherwise arbitrary. This 

, given 

nn' 6 iv, -' il 



has the effect of adding 8D % ^, to the value of AD lJ 



nn 



by Eq. ([9]) . In this way, one can set the parameters D n 
for E n ^ E n ' to any desired value, including zero. This 
is merely a reflection of the fact that the terms D l £ , with 
E n ^ E n ' do not contribute to the single-band effective- 
mass Hamiltonian; 3 - since their contributions are of order 
k 3 or higher. 
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As a particular example, one could choose 



y _ 



5 51 (*Wn' " ( — + — 



2 ^ 



(11) 



which would bring Eq. @ into agreement with Eq. ([5]). 
However, including these terms would make subsequent 
analysis more complicated, so for simplicity the choice 
8D^ n , = is adopted in the remainder of this paper. This 
choice makes little practical difference, since Eq. (TTTI) is 
in fact zero in the Kane model when spin-orbit coupling 
is neglected in the momentum matrix fL 2 ^^ which is the 
only example treated explicitly here. 



spurious roots k s ^_ are unbounded near the zeros of cJ„(n), 
disappearing at d v (tii) = because the order of the secu- 
lar equation is reduced. In typical cases (see Sec. IIV E"| . 
k s f changes from large real to large complex values (or 
vice versa) in the neighborhood of each singular point 
<L(n) = 0. 

Unphysical metallic behavior can be avoided by choos- 
ing {An l nn ,} (or in general S) such that the spurious 
roots disappear. As shown below, in the Pidgeon-Brown 
modelj^ this can be achieved for all directions n by 
setting the conduction-band mass parameter A = 0, 
which is the choice used in Ref. [32|. This choice may 
not work in all models, but one can also choose S such 
that Im^^ 1 ) ^ for all n (or more precisely such that 
In^fc^ 1 )! > fco > 0, where kg is some chosen value). The 
implementation of these choices is discussed in greater 
detail in Sec. ITVCl 



B. Definition and elimination of spurious solutions 



C. Velocity 



The preceding theory can now be used to define spuri- 
ous solutions precisely. Spurious solutions are often de- 
fined as eigenstates of the k • p Hamiltonian with large 
wave vectors, but this definition is not completely sat- 
isfactory because spurious states are sometimes found 
well inside the first Brillouin zone.— As emphasized by 
Bastard , 21 i 24 more important than the magnitude of the 
wave vector is its instability with respect to small changes 
of the Hamiltonian parameters. The unitary transforma- 
tion allows such a change of parameters to be per- 
formed even when the k • p Hamiltonian is calculated 
directly from first principles. 

Let the wave vector be k = ky + nk±, where n • n = 1, 
n • k = 0, and n and ky are real. A spurious solution is 
defined here as a root k±(E, k||) of the secular equation 



2N 



det[H (k) - E] 



k„)fc 



(12) 



1=0 



that is an unbounded function of {ATr l nn ,} for small 
{Air l nn i} and kn and for real E near the energy gap. 
(Here iV" is the dimension of set A) This definition does 
not encompass all possible types of spurious solutions 
(see, for example, those generated by Hamiltonian ma- 
trix elements of order k 4 in Sec. [V] and Ref. 1511 ). but 
it does include those that can be treated effectively by 
the present unitary transformation^ This definition has 
the advantage of simplifying subsequent analysis because 
it focuses attention on the asymptotic properties of the 
secular equation at large k± rather than the general prop- 
erties of the secular equation at arbitrary k±. 

Within the stated limits, all coefficients q in the sec- 
ular equation (T2"|) are bounded (i.e., |q| < 1 in atomic 
units). The roots k±(E,\i\\) can therefore be unbounded 
only near C2N = 0. For a given direction n, C2N is just 
the product of eigenvalues d„(n) (v — 1,2,..., N) of the 
matrix D(ii) = hifijD^ . Hence, as {Air^,} varies, the 



Although the transformation ^ replaces 7r with 7r 
in the Hamiltonian, it does not do so in the velocity 
v = — i[x, H], where x is the coordinate. As shown in 
Appendix [Bl the effective velocity v = e _s ve s for set A 
is given to first order in k by an expression of the form 
([Taj) with 



v nn' W — n nn' + h^nri' + kj / 



nl"ln' 



+ 



(13) 



This shows that v„„' (k) ^ "^^H nn i (k) even to zeroth 
order in k. That is, the velocity to order k° is tt, not 77, 
for both v and v. In the special case A7z nn i = — 7r„„/, 
Eq. (|13p is equivalent to the optical transition matrix 
element given in Eq. (19) of Ref. [571 



D. Implications for parameter fitting 

The above results suggest the need for a reinterpreta- 
tion of prior work on experimental fitting of k • p parame- 
ters. In a model with a complete set of D] ] ln , parameters, 
the empirical masses and Lande g factors are not suf- 
ficient to determine H; in fact, for E n ^ E n > , Tr z nn , is 
arbitrary. This indeterminacy could in principle be re- 
solved by fitting v to measured oscillator strengths, but 
this is not usually done because optical transition rates 
are considered to be less reliable than resonance frequen- 
cies. Instead, the most common procedure is to fix a 
few values of D% n , by setting the contributions from B to 
zero or some other convenient value (see, e.g., Refs. HH, 
HH, Hil and 60), thereby permitting a deterministic fit of 
■K l nn , from frequency data. 
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However, this procedure is nothing but the present 
transformation (albeit without explicit recognition that 
a change of basis is involved) with ' , chosen for cri- 
teria other than the elimination of spurious solutions. 
The outcome of the fitting procedure is thus it, not tv 
(although typically tt ks it). This shows that the pro- 
duction of spurious gap states by many k • p parameter 
sets is not purely a matter of experimental necessity but 
at least partially an artifact of choices made in simplify- 
ing the D matrix. Fitting ' , to nonparabolic effects 61 
fails to resolve the quandary because the 0(k 4 ) terms 
needed for a correct description of nonparabolicity have 
been omitted. [Including 0(fc 4 ) terms in the experimen- 
tal data fitting is also of no help because it merely shifts 
the indeterminacy to a larger set of parameters.] In the 
absence of direct measurements of tt, it is not possible to 
distinguish 7r from 7r (i.e., to define unambiguously the 
original LK basis) without using a microscopic model to 
calculate some or all of the k • p parameters (see, e.g., 
Refs. [62| andHH). Of course, the results are then only as 
good as the chosen model. 



III. HETEROSTRUCTURES 

The next step is to extend this change of basis from 
bulk crystals to heterostructures. Here it is applied to 
the nonlinear response theory of Refs. |49|, H3, and Hi], in 
which the heterostructure is treated as a perturbation of 
some virtual bulk reference crystal. To first order, the 
effective A Hamiltonian is H — H^°> + where the 

reference Hamiltonian is handled according to the 
above methods, and the linear Hamiltonian i a 49 ' 51 

(nk|tf«|nV) = ^'0 Q (k-k')iO(k,k'), (14a) 



is defined by an expression similar to (|14aP with 

k v ia 4- v Q ' V 

™ /-r. v »A rm ' 1 Ann' t 

°nri V K i K I — ■ 

Here X™< is only part of the change in n 1 ^, , since 



(15) 



A-7T* J^ a 

a ia ia \ A nl In' 

nn' Ann' 



nn y^-nn / j 
I 



AtT Q \ = Y m 
nn' A.nn 



I 



Un'l 



(16a) 



(16b) 



where Att"^, = (ATr™ n )* . Likewise, the changes in the 
linear D tensor are given by 



nn' / j 



I 



A i zja ~i ia 
^nl^ln' i w nlXl n > 



AD m3 , 

nn' 



AD 1 " 3 , 

nn' 



I XnlH 



I 

■A / A i ~otj ~ i aj 



I 



i 



(17a) 
(17b) 

(17c) 



U n 'l 



where AD«% = (AD^J* and AD% = (AZ#*)*. This 
system of linear equations can be solved for Air" as a 
function of AD a . An equivalent matrix formulation of 
Eqs. (fl~6|) and (fT7|) is given in Appendix [A] 



IV. THE PIDGEON-BROWN MODEL 



A. Conduction band 



where the sum covers independent value s 49 ' 50 of a and 



Hnn'i^k') - E" n , + ^ a S nn >S kk > 



kir ta 



,k' 



_i_ h k D %3a 
-t- K t K 3 lJ nn , 



k D %a3 k' 
^ U nn'^j 



Kim- ( i4b ) 



Here a (k) is the Fourier transform of 9 a (R), which is 
the change in fractional weight of atom a in cell R of 
the heterostructure relative to the reference crystal. The 
coefficients in (|14bl) are defined in Ref. they have the 
symmetry of site a in the reference crystal and satisfy 
hermiticity relations such as = (D^,")*. The super- 
scripts on these coefficients indicate how the coordinate 
and momentum operators are ordered. For example, in 
the coordinate representation, the term proportional to 
has the BenDaniel-Duke ordering 64 D l " 7 3 ,pi9 a (x)pj, 
where p is the momentum operator For a bulk pertur- 
bation of the form 9 a (k — k') = # a <5kk', operator or- 
dering is irrelevant and only the sums ir^, + ir^, and 

D nn' + D Tn' + D nn' can be distinguished. 

The unitary transformation (JSJ) is now applied with 
S = + SW, where is the same as © and S^> 



As an example, consider Pidgeon and Brown's formu- 
lation of the Kane model for a zinc-blende crystal^ The 
set A — {Tec, rsv, r7 V } is defined in the tensor-product 
basis {\S), \X), \Y), \Z)}®{\+), |-)}, with spin-orbit cou- 
pling included only to order k°~&&£!L For the bulk ref- 
erence crystal, the relevant conduction-band (CB) con- 

From Eqs. Q and 



stants are A — -Dgg and P 



'sx- 



© , the values of A and P are related to the CB effective 
mass m c by 



1 



p2 _ p2 

A+ — = A + — . 



2m c 

in which e n is the n th -order reduced energy gap: 
1 2 1 



(e„)« 3(£ g )« 3(£ g + A so )"' 



(18) 



(19) 



where E„ 



— -^6c — E$ v and A so = _c/g v — rj-j v . 
of P = —iirgx needed to obtain a desired change AA 
A — A is therefore 



Ea 



E 7v . The value 



P z 



P 2 - eiAA 



(20) 
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The selection of suitable values of AA and P is discussed 
below in Sec. HVT^I 

For the linear response in a heterostructure, there are 
two independent CB partial mass coefficients, A " a = 
A a " and A°" (where A " a = D x s x s a , A a " = D% x s x , and 
A' a ' = Df/cf), and two independent momentum parame- 
ters, P a ' — —i^sx aim P a = -iTsx- Upon solving Eqs. 
dH]) and (JT7J) for the changes AP"' and AP' a needed to 
obtain desired values of AA" a and AA' a ' , one finds 



AP a = 
AP a = 



e\AA a " P a AP 


e\AAE* 


P 


p 


2e 2 P 


e\AA' a ' P 


a AP 


+ - 


2P 


P 


2elP 


Eg S and 






E v *E Sv 






4 W + 


3(P g + A so ) 2 • 



(21a) 
(21b) 

(22) 



If one adds (|21a|) and (|21b|) to obtain the total linear 
change AP a = AP a '+AP a for a bulk crystal, the result 
is identical to what is obtained from linear variation of 
the parameters in Eq. (|2"0"1) . 

Equations (|2TJ|) and (|2"Tj) can now be inserted into (JTTJ) 
to determine the changes in the other mass parame- 
ters. The partially renormalized bulk CB Lande factor 
g = —i2(Dg y + s+ — D s x + S+ ) is related to the fully renor- 
malized experimental value g c by 



to have little effect on the calculated energy levels; there- 
fore, it was treated as an adjustable parameter (A —> A, 
P — ► P), with A — I chosen for simplicity. The Lande 
factor, however, was held fixed at g — 2. The present re- 
sults show that the assumption Ag = must be regarded 
as an approximation because it cannot be reduced to a 
unitary transformation. 

According to Eq. (j25|) , Ag will be negligible in compar- 
ison to A A if the spin-orbit coupling is small (A so <C E g ). 
However, the Pidgeon-Brown model is often used in cases 
where A so ~ E g or even A so ^> E g . In such cases, setting 
Ag = is no more justifiable than setting AA = when 
AP ^ 0. But this problem is easily resolved by using the 
value (|25|) for Ag (assuming, of course, that the correct 
original value of A is known) . 

The Kane interband parameter B = D s v z + D v s x z is 
neglected in the Pidgeon-Brown models because it cor- 
responds to 0(k 3 ) terms in the single-band Hamiltonian. 
The value of B is not affected by the linear term S l nn , 
in the generator ([7]) (i.e., AB — 0) because B does not 
depend on P. Therefore, neglecting B is a consistent 
approximation in the sense that B = implies B = 0. 
Alternatively, one could choose 5D\° ln , in Eq. (|10[) to sat- 
isfy SB = —B, thus obtaining B = even when B ^ 0. 



B. Valence band 



1. Zero spin-orbit coupling 



9c = g 



where 



4p2 

1 



4p2 

1 



The change Ag = g 



(£g) n (E t 
g is therefore 
4A 



+ A so y 



(23) 



(24) 



-)AA (25) 



3E g + 2A so J 

Likewise, the changes in the linear-response terms are 



Ag a - 
Ag a 
where 



-AA°- - -AAE" 



4e 

— AA' a ' 
35i 



3 
4 



Si4 



si 



-AA 



h4 5 2 2 



ft 



rr ci. 

E Sv 



E? v 



5 2 2 {E g y (e s + A so y 



(26a) 
(26b) 

(27) 



Note that g' a ' is also the linear contribution to the CB 
Rashba coefficient.— 

When spin-orbit coupling is neglected in the remote 
B states^^^ one has simply g = 24£ In the original 
paper of Pidgeon and Brown^ the value of A was found 



For the valence band, consider first the case without 
spin. There are four independent Tisv parameters^ L = 



DTx, M 



n xy 



D y x Y . From Eqs. (gj) and © we have 



D y x x Y , and K = D X X V Y 



L° =L- P 2 /E' g 
AI° = M 
N° =N - P 2 IK 
K° = K - P 2 /E' g 

where [see Eq. (gj)] L° = 
ated for the subset .Ao 
E g + iA so is the energy gap in the absence of spin-orbit 



L - P 2 /E' g , 
M, 

N - P 2 /E' 
K - P 2 /E' 



(28a) 
(28b) 
(28c) 
(28d) 



L(Aq) is the parameter L evalu- 
= {r 15v }, and E' = E lc - E 15v = 



splitting. Under these conditions ei 
changes are simply 



E' so the bulk 



AL = AN = AK 



-AA, 



AM = 0. 



(29) 



Likewise, for the linear response, AM a " = AM' a ' = 
and 



AL a " 
AL' a ' 



AN a " 
AN' a ' 



AK a - 
AK a - 



-\AA a \ 
-2AA a \ 



(30a) 
(30b) 



Again, the total bulk linear variation AL a = AL" a + 
AL' a ' + AL a " is consistent with (|29p . However, the 
interchange of CB and VB operator orderings in (f3"0"l) 
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is a new feature that was not predicted by the simple 
model of Ref. (where only the numerical value of P 
was changed, and all terms with the ordering X a " were 
excluded^). 

Although it vanishes in bulk, the linear VB momentum 



does have one independent constant R' a = 



t xy (with 

-R a )£L This term is not affected by 
the change JT5J; i.e., AR a = AR a = 0. 



-ITT 



XY 



2. Nonzero spin-orbit coupling 

In the Kane modelji^^ spin-orbit coupling is 
included to order k by adding the perturbation 
H so = |A so (I • cr) to the Hamiltonian for set A = 
{T§ c , Tgv, r7 V }, where I is the orbital angular momen- 
tum operator^ When working in a basis that diagonal- 
izes iJsofi^i^ it is convenient to define the following 
linear transformation of the mass parameters 



.66 



71 
72 
73 
K 



2M), 
M), 



■iiV, 



L 

M 
N 
K 



-5(71+472), 
4(71-272), 
-373, 
-3k - 1. 



(31a) 
(31b) 
(31c) 
(31d) 



Here 71, 72, 73, and k are the modified Luttinger pa- 
rameters introduced by Pidgeon and Brown4^ Upon ap- 
plying Eq. (J4]) to the subset Aq = {rg v }, one finds the 
relations^ 



7? 
72° 
7 3 ° 



7i 
72 
73 



h 2P 2 /3P g , 
FP 2 /3P g , 

P 2 /3P g , 



(32a) 
(32b) 
(32c) 
(32d) 



where 7°, 7$, 73, and k° are the original Luttinger 
parameters^ for Ta v ■ The Luttinger parameter q° = q is 
neglected in the Kane model because (to leading order) 
it is proportional to the spin-orbit splitting of the remote 
B states j££ Since q is independent of P, it is not affected 
by the unitary transformation 

It is also convenient to introduce a linear transforma- 
tion of the form (|3"T) for the parameters 7", 7°, 7°, and 
k° in (l32|) . When this is done, one finds that the pa- 
rameters L°, M°, N°, and K° are related to L, M, N, 
and K by expressions similar to those given earlier in Eq. 
(|28|) . The only difference is that the P15 energy gap E' g 
is replaced [see Eq. (f3"2"j)] by the Tg gap E g . 

Now consider using Eq. (T5|) and (O to determine how 
the effective-mass parameters change when the unitary 
transformation ^ is applied. For the r 8v submatrix, 
the results are similar to Eqs. (f2"B"| (with E ' — > E g ) and 



(33a) 
(33b) 



?° = Q & - P 2 /E g 
= Q S - P 2 /E g , 



where Q is any member of the set {L, N, K, — f 71, —372, 
—373, —3k} (M does not depend on P). The subscript 
8 is added to emphasize that Eq. l[33|) holds for F 8v only. 
Here Qg is an original Luttinger parameter, while Qg = Q 
and Qa are the modified Luttinger parameters before and 
after the unitary transformation. 

However, when Eqs. (j4]) and ([9j) are applied to the T7 V 
submatrix, the results are different: 



Q7 = Qi 



P 2 /(E g 
P 2 /(E e 



A. 







(34a) 
(34b) 



In the Kane model, Q 7 = Qg = Q, but clearly Q 7 7^ 
and (when P 2 ^ P 2 ) Q7 7^ Qs- Such differences also 
occur in the x Fg v submatrix, where the parameters 
are given in terms of the above results by Q 7S = \{Q 7 + 

Qi), Qts = §(Q 7 + Qs), and Q 78 = |(Q 7 + Q 8 )- The 
changes in Q for each submatrix are therefore given by 



AQg = - 



eiAA 



AQ 7 = 



eiAA 



AQ 78 = -(AQ 7 + AQ 8 ). 



(35) 



The result Q 7 ^ <3 8 merely reflects that the Luttinger 
parameters for T7 V are different from those for Fg v . (This 
fact is sometimes used to obtain an experimental fit for 
p^blSM) However, the result AQ 7 ^ AQ$ shows that 
the unitary transformation ([6]) does not preserve the ini- 
tial equality of the modified Luttinger parameters in the 
T8v, T7 V , and F7 V x Lg v submatrices. 

The latter result is hardly surprising, because the mod- 
ified Luttinger parameters are known to have different 
values in each submatrix when spin-orbit coupling is 
treated exactly^iHiZ 2 . The inequality AQ 7 ^ AQ 8 is 
therefore nothing new from a physical standpoint. Nev- 
ertheless, it does serve to show that the standard exper- 
imental data-fitting procedure — namely, treating P as a 
fitting parameter and defining the modified Luttinger pa- 
rameters for all of set A from the Fg v Luttinger param- 
eters via Eq. (|32p — is not equivalent to a simple unitary 
transformation. Instead, as P is varied during the fitting, 
one must invoke the additional approximation of replac- 
ing AQ7 with AQg in order to preserve equality of the 
modified Luttinger parameters in all submatrices. 

The assumption that the modified Luttinger parame- 
ters have the same value in all submatrices even when P 
is treated as a fitting parameter will be referred to as the 
Pidgeon-Brown approximation (PBA), since these au- 
thors seem to be the first to use it explicitly^ (although 
this approximation is implicit in the theory of Kanei*^) . 
The validity of the PBA was studied by Boujdaria et 
cd.r^ who found that it works well in many materials^ 
However, it should be noted that the error 



AQ 8 -AQ 7 = -^AA = - 



3A S 



3E„ + 2A S 



AAA (36) 



in making the replacement AQ 7 — > AQ 8 is of the same 
order as Ag in Eq. (|2"5)) . In Sec. IIV Al it was argued 
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that Ag is not generally negligible. The error in AQ7 is 
negligible in the present context not because AA is small 
(although it usually is), but because this error affects pri- 
marily only the spin-orbit split-off T7 V band. This band 
is typically not of direct experimental interest unless 
A so C£ g) in which case the error (f36|) is negligible. 

In a heterostructure, the linear changes are similar to 
the spin-zero expressions (|30l) . In keeping with the PBA, 
only the Tg, results are given here: 



AQ a - = - 
AQ a - = - 



eiAA' Q - 
2 ei AA a - 

e7~ 



e 1 AA(E$ v 
2E„ V E„ 



(37a) 
(37b) 



The present method could, of course, be used without 
the PBA; this possibility is discussed below in Sec. IIVDI 



C. Choice of parameters 

Procedures for choosing P to avoid real spurious so- 
lutions have not yet been specified. In Sec. IIIBI it was 
shown that spurious roots k s f disappear at the zeros of 
the eigenvalues d„(n) of the matrix D(ti) = fiifijjj^ . In 
the PB model, D is block diagonal: D = D c © D v . The 
eigenvalues of the CB block P/ C (n) are independent of the 
direction n: cf£(n) = A (where v — 1,2). Thus, one can 
eliminate spurious solutions for all n by setting A = or 
P = P c , where 



P 2 = P 2 + e x A = e 1 /2m c 



(38) 



This was the choice adopted in Ref. [32]. As shown there, 
for typical semiconductors \e\A\ <C P 2 , so P c w P and 
the resulting changes in the Hamiltonian are small^ 

Other choices of P can also be used to obtain physi- 
cally meaningful results. In the limiting case P = of 
single-band effective-mass theory, all states within the 
energy gap are evanescent. Spurious solutions cannot be 
identified in this case because the CB and VB states are 
completely decoupled. For infinitesimal P, there is an in- 
finitesimal anticrossing of the evanescent gap states; the 
spurious solutions can then be identified as the branches 
with Im k± ^ over the entire range of energies near 
the band gap. Spurious gap modes remain evanescent 
for all n in the finite interval < P 2 < Pq, where 
P 2 = mm(P 2 ,P 2 ), P 2 = min fi P y 2 (n), and P 2 (n) is 
the smallest value of P 2 where any rf^(n) = 0. 

As shown in Appendix [Cl when the Luttinger param- 
eters satisfy 73 > 72 (which is true for most semiconduc- 
tors^^), the constant P 2 is (in the PBA) simply 



acceptable without any changes at all (AA = 0, P = P). 
If not, a valid alternative to setting A — is to choose 
a value of P 2 within this interval,— preferably near the 
upper bound Pq in order to minimize AP. For typical 



semiconductors both 



E g L\ 



< P 2 and \eiA\ < P 2 



so 



the changes in the Hamiltonian are small regardless of 
is equal to P 2 



whether Pq 



or P 2 . 



D. Beyond the Pidgeon— Brown approximation 

The PBA used here is open to the objection that it does 
not provide an exact description of the mass of the spin- 
orbit split-off r 7v band^i (although, as discussed above, 
it provides a good approximation in many cases^). This 
deficiency can be remedied by applying the present uni- 
tary transformation to the Hamiltonian of Weiler et a?.,— 
which includes a full set of independent parameters for 
T7 V . In particular, the momentum matrix element P has 
different values P$ and P7 for the coupling of Tq c to T$ v 
and T7v , respectively. Equations (fT8j) and (|23|) for the CB 
effective mass and g factor must therefore be replaced by 



A- 



and 



2P| 



+ 



3E g 3(P g + A so ) 



A- 



2P| 



3P g 3(P g + A so ) 



4P| 4P 7 2 
3E~ g + 3{E i 



4P 2 



4P 2 



A so ) 



3P g 3(P g + A so ) ' 



(40) 



(41) 



Since there are two independent momentum parameters, 
one can choose the values of Pg and Pj in order to achieve 
desired changes in both A and g: 



Pi = Pi-E s (AA-lAg), 



P 7 2 = P 7 2 



(P g + A so )(AA + iA ff ). 



(42a) 
(42b) 



For example, one could choose AA = —A and Ag = —g 
in order to set the entire CB D matrix to zero. Alter- 
natively, since g has no effect on spurious solutions, one 
could choose Ag = in order to minimize the changes 
in the Hamiltonian. In cither case, Eq. ^ can then be 
applied as usual to determine the changes in the other D 
parameters. However, since the Hamiltonian of Ref. [73 
contains many parameters that are not commonly used 
in k • p calculations, 6 — this procedure will not be carried 
out in detail here. In this case, it may be more conve- 
nient to calculate the Hamiltonian changes numerically 
using the matrix equations given in Appendix fAJ 



P 2 



P 2 



E g L 



-L E„. 



(39) 



If P 2 lies within the interval < P 2 < P 2 



which is the 

case in the numerical examples considered below — then 
the spurious gap states in the original k • p Hamiltonian 
are evanescent for all n, and the Hamiltonian is physically 



E. Two-band model 

In the special case of a bulk crystal with no spin-orbit 
coupling and k = (0, 0, k), the \X) and \ Y) valence states 
are not coupled to the other states. Spurious solutions 
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in the PB model are consequently confined to the two- 
dimensional basis {\S), \Z)} with Hamiltonian 



H(k) = 



E c + Ak 2 iPk 
-iPk E v + Lk 2 



(43) 



This case is of interes t 14 ' 15 i 20 ' 27 i 46 because it allows sim- 
ple analytical calculations of the properties of spurious 
solutions; it will also be used in some of the numerical 
work in Sec.[V] The original Hamiltonian parameters are 
assumed to satisfy the constraints E' = E c — E v > and 



P 2 + E'A > 0, P 2 ~ E'L > 0, 



(44) 



in which the first condition is equivalent to m c > [see 
Eq. (US])] and the second is equivalent to L° < [see Eq. 
(|28aj) ]. It is also assumed that P 2 > and P 2 > 0, which 
from Eq. (20]) requires that A A < P 2 /E' g . 

The secular equation (fl"2"]) for the Hamiltonian (|4"3"j) has 



the form c^k + c 2 k + c 
(E c - E)(E V - E), and 



0, where C4 = AL, cq 



c 2 = A(E V -E) + L{E C -E)-P 2 
= A(E V -E) + L(E C -E)- P 2 , 



(45a) 
(45b) 



in which the second equality follows from Eqs. (fl~8"]) and 
(|28ap . Hence, the coefficients cq and ci are invariant with 
respect to the unitary transformation ([6]). If the Hamil- 
tonian (143]) were extended to include terms of order fc 4 , 
then C4 would also be invariant,— but this would gener- 
ate terms of higher order in the secular equation that are 
not invariant. In general, the highest-order coefficient in 
the (finite-order) secular equation is not invariant with 
respect to the unitary transformation ([6]). 

The ge neral solut ion to the secular equation is k 2 — 
(— C2 ± yc| — 4c4Co)/2c4, which shows that for bounded 
cj, k is unbounded only when C4 — * 0, as discussed in Sees. 
IIIBI and llVCl The spurious solutions have a particularly 
simple form when E = E v or E = E c : 



k 2 sp (E v ) = (P 2 -E' g L)/AL, 
k 2 p (E c ) = (P 2 +E> g A)/AL. 



(46a) 
(46b) 



For other values of E, note that 02(E) is a linear func- 
tion of E that interpolates between the values C2(E V ) = 
~{P 2 - E' g L) and c 2 {E c ) = ~{P 2 + E' g A). Thus, for 
small C4, k 2 p (E) interpolates approximately linearly be- 
tween the values (|46aj) and (|46b|) . 

From Eq. (J44J), the numerators of Eqs. (|46a|) and (|46b|) 
are both positive. Therefore, the spurious solutions are 
evanescent when AL < and propagating when AL > 
0322 Now AL = AL + AA(L-A) - (AA) 2 is a quadratic 
function of AA that has its maximum value when AA = 
\(L— A). According to Eq. (|4~4"]) . this value of A A satisfies 
the constraint AA < P 2 /E' g and is therefore permissible. 
When AA = \{L- A), A = L= \{A + L) and thus 



k 2 sp (E v )=4(P 2 -E' g L)/(A + L) 2 , 
k 2 (E c )=4(P 2 + E g A)/(A + L) 2 . 



(47a) 
(47b) 



These expressions give the smallest positive values of 
k 2 p (E v ^ c ) that are possible for any AA consistent with 
the given Hamiltonian parameters E' g , P, A, and L. They 
consequently represent the "worst" result that could be 
obtained from any unitary transformation ([6]). For the 
special case L = —A, real spurious solutions do not ex- 
ist for any AA, but for L ^ —A, real spurious solutions 
always exist for some AA. 

It should be noted that the present theory provides 
the first rigorous justification for the two-band model of 
White and Sham , 14 ' 15 in which it is assumed to be possi- 
ble always to choose A > and L < and to invoke the 
limit A — > 0. The assumption L — — A, however, is not 
generally valid. 



V. NUMERICAL EXAMPLES 

Numerical examples demonstrating the success of the 
A = method in eliminating spurious solutions have 
already been given in Refs. [H and[47]. Since the present 
PB Hamiltonian for the case A = is identical to that 
of Ref. ]32| in bulk material, those examples will not be 
repeated here. The main new feature is the interface 
operator ordering derived in Eqs. (|21[) . (|2"6"j) . (|3T)]) . and 

To demonstrate the validity of these results, the Tis 
valence™ subband structure of Ino.53Gao.47As/InP and 
GaAs/AlAs superlattices was calculated in a plane- wave 
basis using the ABINIT cod o 77 ' 78 ' 79 with norm-conserving 
pseudopotentials and the local-density approximation 
(LDA). Spin-orbit coupling was omitted, and all tech- 
nical details were the same as in Ref. [5CJ. These "ex- 
act" model calculations were compared with the first- 
principles envelope-function (EF) theory of Refs. and 
l5ll . which has no fitting parameters (not even the mean 
energy). As described in Sec. IIII1 the EF Hamiltonian 
was constructed using nonlinear response theory; for this 
purpose, the bulk reference crystals were chosen to be 
Ino.765Gao.235As .5Po.5 and Alo.5Gao.5As. 



A. Material parameters 

Calculated values of the parameters in the Kane 
Hamiltonian are presented in Table [I] for the bulk mate- 
rials of interest. This table includes values for the Kane 
parameter 2,3 B = L>" J Z + D V g Zl although this term is ne- 
glected in the PB model^ 8 - The values in Table Q] were 
calculated for the set A — {ri c ,ri5 V }, whereas those in 
Ref. [fO] were for the single-band case A — {Fi5 V }. The 
values of L, N, and K in Table U therefore differ from 
those in Table I of Ref. [5l| because they do not include 
the interaction with the Ti c state \S). Since the coupling 
to the remote B states is weaker in the present case, the 
parameters in Table U have a relatively small variation 
between materials, and the variation is nearly linear. 
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TABLE I: Material parameters for several bulk compounds 
and their virtual-crystal averages. The numbers in parenthe- 
ses were obtained from linear interpolation. The signs of B 
and P depend on the phase conventions chosen for the basis 
functions. Here the coordinate origin was placed on an anion, 
with a neighboring cation at ja{l, 1, 1); the phases were then 
fixed by setting (x|5) > and d(x\X)/dx > at x = 0. 





GaAs 


Alo. 5 Ga 


, 5 As 


AlAs 


A 


+0.280 


+0.345 


(+0.339) 


+0.399 


B 


-1.156 


-0.837 


-0.777) 


-0.398 


p 


—0.560 


—0.552 


(—0.553) 


—0.545 


L 


-0.435 


-0.345 


(-0.343) 


-0.250 


M 


-1.511 


-1.326 


(-1.318) 


-1.126 


N 


-1.553 


-1.373 


(-1.366) 


-1.179 


K 


+2.507 


+2.310 


(+2.303) 


+2.100 




Ino.53Gao.47As 


Ino.7e5Gao.235 


As . 5 Po.5 


InP 


A 


+0.321 


+0.292 


(+0.299) 


+0.278 


B 


-0.986 


-0.954 


(-0.927) 


-0.869 


P 


-0.534 


-0.505 


(-0.504) 


-0.474 


L 


-0.429 


-0.413 


(-0.410) 


-0.390 


M 


-1.385 


-1.253 


(-1.252) 


-1.119 


N 


-1.423 


-1.296 


(-1.292) 


-1.160 


K 


+2.370 


+2.165 


(+2.162) 


+1.953 



To demonstrate the latter point, the numbers in paren- 
theses in Table Q] give the parameters that would be ob- 
tained for the reference crystals if Vegard's law of lin- 
ear interpolation were valid. Linear interpolation works 
well in all cases, with a maximum error of 7% in the B 
parameter for Alo.5Gao.5As. This small error, in con- 
junction with the fact that the total variation is already 
a small perturbation, suggests that the linear perturba- 
tion theory developed in Sec. [Ill] should be a good ap- 
proximation for the momentum and mass parameters. 
(Quadratic-response contributions are included in the 
present calculations ; 49 ! 50 ' 51 but only to order k°.) 

Values for the linear heterostructure parameters de- 
fined in Sees. HIT] and [IV] are listed in Table HH Here A a 
denotes the total bulk value A a = A" a + A a ' + A a " = 
A' a ' + 2A a ". The only other quantities not yet defined 
are the B parameters B' a ' — D x s ^ + D v ^ , B" a = 
flj| tt + D v <^z , and B a " = + P>°gz- In this case 

B a " ^ B " a , so (unlike the other D terms) there are 
three independent linear parameters for B. The M and 
R values in Tabic Q arc the same as those in Table III of 
Ref. [5l|, but the other values are different. 

The operator ordering given by the parameters in Ta- 
ble [IT] does not seem to follow any simple general rules 
beyond the observation that \P a ' \ > \P' a \ for cation per- 
turbations and \P' a \ > \P a '\ for anion perturbations. In 
particular, the BcnDaniel-Duke approximation ; 21 ! 24 ! 64 in 
which mass terms of the form A' a ', B' a ', etc., are as- 
sumed to be dominant, is clearly not valid in most cases. 
(See Ref. Hi] for further discussion of this point.) How- 
ever, since the linear changes are also small in most cases, 
the particular choice of operator ordering in the present 
multiband model is not as important as it would be in a 



TABLE II: Linear parameters in the Tic-Tisv Hamiltonian. 
Here RC stands for reference crystal, and the labels light and 
heavy holes refer to the bulk properties in the (100) directions. 



RC 




Alo.5Gao.5As 


Ino.765Gao 


235 AS0.5P0.5 


a 




Ga 


As 


Ga 


Conduction 


A a 


-0.189 


+0.090 


-0.176 




A a - 


+0.099 


+0.039 


+0.121 




A" a 


-0.144 


+0.026 


-0.148 


Interband 


B a 


-1.151 


-0.023 


-0.566 




B'°" 


+0.369 


+0.104 


+0.329 




B a 


-0.367 


-0.162 


-0.132 




B a 


-1.154 


+0.035 


-0.763 


Momentum 


pa 


-0.019 


-0.060 


-0.009 




pa- 


-0.027 


-0.011 


-0.013 




p-OL 


+0.008 


-0.048 


+0.004 


Light hole 


L" 


-0.128 


-0.037 


+0.086 




L a - 


+0.058 


-0.081 


+0.103 




L" a 


-0.093 


+0.022 


-0.009 


Heavy hole 


M a 


-0.387 


-0.329 


+0.130 




M' a ' 


—0.039 


—0.109 


+0.093 




M a 


-0.174 


-0.110 


+0.018 


k 2 mixing 


N a 


-0.319 


-0.300 


+0.167 




N - a - 


-0.136 


-0.042 


+0.025 




N" a 


-0.091 


-0.129 


+0.071 


Lande 


K a 


+0.464 


+0.487 


-0.055 


Rashba 


K a 


+0.034 


+0.043 


+0.021 




K a " 


+0.215 


+0.222 


-0.038 


S mixing 


R a 


-0.028 


-0.017 


-0.038 



single-band model. 

A comparison of the parameters in Tables [J and |n] 
would seem to indicate some inconsistency in the calcula- 
tion. For example, since the difference in Ga content be- 
tween GaAs and AlAs is just 1, the linear bulk values A a , 
B a , etc., from TablclTTlshould be (approximately) numer- 
ically equal to the difference in the corresponding bulk 
constants of GaAs and AlAs from Table U (assuming that 
the variation is in fact linear, as suggested by the discus- 
sion of Tabic U above). However, ^(GaAs) - A (AlAs) = 
-0.119, whereas A a=Ga = -0.189. The error of -0.069 
in the value predicted by A a is much larger than the er- 
ror of —0.006 in the linear interpolation for A shown in 
Table |H 

The reason for the discrepancy is the different meth- 
ods used to eliminate interband coupling in the two 
cases. The bulk parameters in Table Q] were calculated 
by first diagonalizing the entire (A + B) Hamiltonian at 
k = exactly, and then using perturbation theory to 
eliminate the k • p coupling between A and B. How- 
ever, in the linear-response theory of Refs. [4^ and l5ll . 
the k-independent heterostructure perturbation and the 
k • p terms are all block-diagonalized together using a 
single unitary transformation. 80 ! 81 Since the heterostruc- 
ture perturbation X and the k • p perturbation Y do 
not commute, we have e si - x+Y) ^ e s( - X) e s( - Y \ and the 
two unitary transformations yield different bulk Hamil- 
tonian matrices for set A. But the difference is merely a 
k-dependent unitary transformation of the form defined 
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TABLE III: Error in linear-response prediction of the differ- 
ence in effective-mass parameters for bulk materials A and B. 
These values should satisfy Eq. ([29} if the "error" is not really 
an error but arises only from a unitary transformation. 
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A/B 


GaAs/AlAs 


Ino.53Gao.47As/InP 


AA 


-0.069 


-0.036 


AL 


+0.056 


+0.043 


AM 


-0.001 


-0.002 


AN 


+0.056 


+0.041 


AK 


+0.057 


+0.045 



previously in Eqs. © and 0. 

To demonstrate this, the "errors" in the predictions 
obtained from Table [II] for the differences in A, L, M, 
N, and K between materials A and B (e.g., GaAs and 
AlAs) were calculated from expressions of the form 

AL = ^> Q (A) - 6 a (B)]L a - [L(A) - L(B)). (48) 

a 

The results are shown in Table IIII1 If the discrepancy is 
really due to a unitary transformation of the form ([5]), 
these errors should obey the relations given previously in 
Eq. (f29|) . These relations are clearly not satisfied exactly, 
but the deviation from a pure unitary transformation, if 
divided equally between conduction and valence bands, 
amounts to only about 0.007 for GaAs/AlAs and 0.004 
for Ino.53Gao.47As/InP. This is just the magnitude of the 
linear interpolation error for these parameters shown in 
Table ID 

Hence, a careful examination of the material parame- 
ters shows that a linear approximation should work very 
well for the effective-mass and momentum terms. How- 
ever, it should be noted that the perturbation theory of 
Refs. HH and [HT| yields a k • p Hamiltonian that already 
includes a unitary transformation of the form ([6]) relative 
to the conventional Kane form of the k • p Hamiltonian. 
(As discussed in Sec. Ill Dt this is also the case for most 
empirical k • p data sets found in the literature; the dif- 
ference here is that in the present theory the effect of this 
transformation is known and has already been accounted 
for in the operator ordering for heterostructures.) 



B. Valence subband structure 

As a direct test of the present theory, the T\§ v&- 
lenceZ& subband structure was calculated numerically for 
Ino.53Gao.47As/InP and GaAs/AlAs superlattices in the 
LDA model system described abovei 50 ' 51 The transfor- 
mation © was applied to the set A — {ric,ri5 V } with 
A = \A for the reference crystal and A a " — <5>,,i^4 Q ", 
A a ' = S\,iA' a ' for the linear response, where A is a real 
parameter. Choosing A = 1 gives no transformation at 
all, but any value A ^ 1 modifies the bulk value of A and 
(for simplicity) sets the linear position dependence of A 
to zero. 



pa 



a 




X K 
Wave vector k 



FIG. 1: (Color online) Energy band structure of the bulk 
Ino.765Gao.235 AS0.5P0.5 reference crystal: comparison of exact 
calculation with 4-state k ■ p models. 



To determine the effect of different choices of A, recall 
from Sec. IIV El that the spurious solutions for k || (100) 
are evanescent when AL < and propagating when 
AL > 0. From Eq. (JSSJ we _have AL = -AA, hence 
L = L — (A — 1)A. Thus, A changes sign at A = 0, 
whereas L changes sign at A = 1 + L/A. Putting in 
the values of A and L for the reference crystals in Table 
HI1 one finds that for A = 1, the spurious solutions for 
Ino.765Gao.235Aso.5Po. 5 are evanescent, but values of A in 
the range —0.415 < A < yield spurious propagating 
modes. However, for Alo.5Gao.5As, where L w —A, spu- 
rious propagating modes occur only in the narrow region 
0<A< 0.002. 

To obtain the most rigorous test of the present theory, 
one can seek out the "worst case" value of A that gives 
real spurious wave vectors with the smallest magnitude. 
As shown in Sec. IIV El this case corresponds to A = L = 
\{A + L) or A = A = |(1 + L/A), which is halfway 
between the sign changes for A and L. Since L w —A 
for Alo.5Gao.5As, Eq. (|4"?1) shows that even the "worst 
case" real spurious solutions in this material will have 
extremely large wave vectors. Therefore, in what follows, 
only the Ino.53Gao.47As/InP material system is studied 
in detail, as this provides a more stringent test. In this 
system, the Ino.765Gao.235Aso.5Po. 5 reference crystal has 
A = -0.208. 

The energy band structure for Ino.765Gao.235Aso.5Po. 5 
is shown in Fig. [1] which compares the "exact" solutions 
of the model Hamiltonian with various k ■ p models. The 
k • p calculations for A = 1 and A = —0.208 are very sim- 
ilar for small k, but are visibly different for k near the 
Brillouin zone boundary. The real spurious solutions for 
A = -0.208 and k || (100) occur at k ~ ±15(27r/a), where 
a is the cubic lattice constant. Also shown in Fig. Q] are 
the results when the k • p Hamiltonian is extende d 49 ' 51 to 
include terms of order A; 3 and k 4 ; this case has more obvi- 
ous spurious solutions that occur well inside the Brillouin 
zone. 
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FIG. 2: (Color online) Tib valence subband structure of a 
(001) (Ino.53Gao.47As)24(InP)24 superlattice. 



The valence subband structure of a (001) 
(Ino.53Gao.47As)24(InP)24 superlattice was calcu- 
lated for a series of 0(k 2 ) EF models with A = 1, 
0.5, 0, —0.208, —0.5, and —1. These calculations were 
performed in momentum spaced with a basis containing 
25 EF plane waves (corresponding to a plane- wave cutoff 
at half the distance to the bulk X point). Since the real 
spurious solutions occur at > 15(27r/a), such a cutoff 
is sufficient to filter out the spurious mode a 25 i 32 ' 47 for 
any value of A. 

The results of these calculations are shown in Fig. [5] 
The entire range — 1 < A < 1 is designated by the single 
label EF(fc 2 ), since these values cannot be distinguished 
at this scale — they differ by no more than 0.1 meV for the 
top five subbands and by no more than 0.3 meV for any 
of the 12 subbands shown. The agreement with the exact 
calculations is excellent for the top five subbands (with 
a mean error in each subband of less than 1.8 meV), but 
it begins to deteriorate for energies more than 100 meV 
below the band edge. This discrepancy is due primarily 
to the neglect of terms of order k A in the bulk reference 
Hamiltonian. When these are included^ [see curves la- 
beled EF(fc )], the agreement is much improved, with a 
maximum mean error of 3 meV for the top 12 subbands. 
For the 0(k 4 ) calculations, the number of plane waves 
was reduced to 17 (i.e., one-third of the Brillouin zone) 
in order to avoid problems from the real spurious solu- 
tions in Fig. Q] 

The good agreement shown in Fig. [5] confirms the va- 
lidity of both the operator ordering derived here and the 
linear-response approximation used for n l and D lJ in the 
multiband EF Hamiltonian. (Quadratic-response terms 
were included only in the potential energ y 49 i 50 ' 51 ) Note 
that the 0.3 meV variation for —1 < A < 1 is an order 
of magnitude smaller than the 5 meV variation shown in 
Fig. 2 of Ref. [32|, which did not account for changes in 
operator ordering. 



VI. DISCUSSION AND CONCLUSIONS 

In conclusion, the unitary transformation ((6|) elimi- 
nates spurious solutions in the Kane model with no ap- 
proximation beyond the limitation to second-order differ- 
ential operators. A comparison of the derived operator 
ordering with density-functional calculations of the va- 
lence subband structure shows very good agreement. 

This good agreement was obtained using a linear- 
response approximation for the material dependence of 
the partially renormalized multiband effective-mass and 
momentum parameters, with quadratic "bowing" terms 
included only for the band-edge energies. As shown 
in Sec. IV A( such an approximation is justified in a 
multiband Hamiltonian because (unlike the single-band 
case^i) the corrections due to renormalization from the 
remote B states have a relatively weak variation between 
materials. Indeed, the present description of material 
properties is almost identical to the interpolation scheme 
for the conduction-band mass of ternary alloys recom- 
mended on page 5837 of Vurgaftman et al££ (although 
it should be noted that this scheme was not used uni- 
formly for all effective-mass parameters in Ref. |60| ) . The 
only difference is that the present work treats P as linear, 
whereas Vurgaftman et al. treat P 2 as linear^ 

It should be noted that the linear Hamiltonian (fl4|) 
is expressed in an atomistic form, as the superposition 
of responses to individual atomic perturbations. This is 
the simplest and most natural way of expressing the re- 
sults of linear response theory. Such a description may be 
unfamiliar to many readers since most envelope-function 
models are formulated in terms of bulk compounds rather 
than atoms. However, as shown in Sec. VII A of Ref. 
I-491 . a traditional bulk-crystal description can be obtained 
from the present atomistic formulation via a straight- 
forward linear transformation of variables (bearing in 
mind that the "bulk" compounds for the no-common- 
atom Ino.53Gao.47As/InP material system must include 
not just Ino.53Gao.47As and InP but also the interface 
materials InAs and Ino.53Gao.47P). Nevertheless, there 
are advantages to becoming familiar with both points of 
view, since the atomistic perspective is simpler and bet- 
ter suited for the description of complex nanostructures. 

Most envelope-function models are also expressed in a 
general form that allows the inclusion of arbitrary non- 
linear material dependence in the effective-mass and mo- 
mentum terms. However, the ability to include nonlin- 
ear terms does not necessarily imply greater accuracy, 
since the present results show that the operator ordering 
used in most envelope- function models is not correct even 
to linear order. The possibility of applying the present 
unitary transformation to a more general phenomenolog- 
ical Hamiltonian with arbitrary nonlinear material de- 
pendence was examined during the development of this 
paper, but since the interface structure of the resulting 
theory is much more complicated than the linear theory, 
it was not considered worthwhile to publish the nonlinear 
results. The linear approach has the advantage of pro- 
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viding simple analytical expressions for precisely those 
terms that are of greatest importance in a multiband 
envelope-function theory. 

For practical problems, a full implementation of the 
operator ordering derived here would require the knowl- 
edge of many parameters that have not been measured 
experimentally and cannot yet be predicted accurately 
from first principles. Therefore, in the near future, any 
practical application of the theory based purely on ex- 
isting empirical data will require the use of some ap- 
proximations. This point is underscored by the results 
obtained in Sec. Ill D| which show that the bulk k • p pa- 
rameters generated by typical experimental data-fitting 
procedures already include a unitary transformation — of 
unknown magnitude — of the type defined here. The un- 
certainty would seem to be greatest for the convenient 
tabulations in review articles^ of parameters compiled 
from many sources. 

Given such uncertainty in the existing experimental 
data, it is reasonable to base short-term applications of 
the present theory on the criterion of simplicity rather 
than theoretical rigor. If an unknown bulk unitary trans- 
formation is already present in the empirical parameters, 
the original LK basis cannot be defined experimentally, 
and it is not possible to make any definite statements 
about operator ordering in heterostructures. Therefore, 
one might as well choose something simple, such as the 
conventional BenDanicl-Duke operator ordering. For 
simplicity, one can apply this operator ordering after a 
bulk unitary transformation has been used to eliminate 
real spurious solutions, as in the heuristic model of Rcf. 

m 

Two choices of unitary transformations in heterostruc- 
tures stand out for their simplicity. One is to set A = 
everywhere^ which has computational advantages^ be- 
cause it allows the conduction-band envelopes to be elim- 
inated from explicit appearance in the envelope-function 
equations^ The other is to select a single value of P 
for the entire heterostructure ; 18 ' 27 which is chosen to 
yield evanescent spurious solutions in accordance with 
the guidelines given in Sec. irVTH Assuming that A ^ 0, 
this choice simplifies the interface boundary conditions 
(for calculations based on the flat-band approximation) 
because it ensures continuity of all envelopes i 14 i 15 

The above approach is merely a quick practical fix 
in which the uncertainty in experimental parameters is 
openly acknowledged and even turned to advantage by 
selecting simple operator ordering and a parameter set 
with no real spurious solutions. The resulting errors in 
operator ordering — which are probably systematic — are 
simply ignored. 

However, it is hoped that the present theory will also 
provide a stimulus for future work in which the sources 
of ambiguity in our present knowledge are steadily elim- 
inated. With a careful combination of experimental 
data and empirically-based microscopic theory (such as 
empirical pseudopotentialsS 2 - or empirical tight-binding 
theory 6 -) it should be possible to establish for each mate- 



rial whether spurious solutions in the Kane Hamiltonian 
are really required by experiment or are merely an ar- 
tifact of current data- fitting procedures. Application of 
the same methods to heterostructures will provide more 
definitive results for the parameters that determine op- 
erator ordering. At the same time, extensions of the 
present ab initio techniques to include quasiparticle self- 
energies and projector-augmented waves should provide 
more accurate predictions of parameters from first prin- 
ciples. It is hoped that at some time in the near future 
these two lines of investigation will converge to yield a 
practical k • p theory free from ambiguity. 
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APPENDIX A: MATRIX FORMULATION OF 
HAMILTONIAN CHANGES 

(n) 

Let Hm be the contribution to the matrix H that is of 
order 9 n k m , where 9 is the heterostructure perturbation 
parameter introduced in Eq. (|14a[) . Then the changes in 
the reference crystal Hamiltonian (TTJ due to the unitary 
transformation ^ are given by [cf. Eq. ^] 



r(0) c(0) 



(0) o(0)i 



(Al) 
(A2) 



while the changes in the linear Hamiltonian (fT4"|) are [cf. 
Eqs. flU and OH)] 



AH^ = [H^\S^] + [H^,Sn 



r(l) c(°) 



(0) 0(1)1 



(0) o(l)l 



(A3) 
(A4) 



in which H[ n) = H[ n) 



(n) 



APPENDIX B: COORDINATE AND VELOCITY 

This appendix examines the effect of the transforma- 
tion ([6]) on the coordinate and velocity. In the LK rep- 
resentation, the coordinate operator inside the first Bril- 
louin zone is just iVk^nn"^ After the k ■ p coupling be- 
tween A and B is eliminated, the effective coordinate for 
A becomes 

(nk|x|n'k') = x„„,(k)<5(k - k'), (Bla) 

in which the operator x n „/ (k) is given to first order in k 
by 



x rm' (k) — i Vk<5nn' 4" — f^rm' x k. 



(Bib) 
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Here ft nn > is the Berry curvature 8 ^^ 8 . 4 .^^ 6 . at k = for 
the quasi-Bloch (or transformed LK) basis: 

8 

finn' = i^Znl X £ln> , ( B2 ) 
I 

in which £ nn i is the crystal coordinate 8 ^ or Berry 
connection 8 ^ 

inn' — — (E n ^ E n i). (B3) 

^nn' 

The effective velocity v = — i[x, H] for A is therefore 
given (to first order in k) by 

v„„: (k) = V k H nn , (k) + l -^n nn , x k. (B4) 

Here the contribution from fi„ n / vanishes in a single- 
band effective-mass model, but not in a multiband model. 
This contribution is related 8 ^ to the so-called anomalous 
velocit y 82 i 83 or Hall velocity^ 6 - in an external field. J~2 
is a hermitian operator with the same symmetry as an 
angular momentum or a magnetic field (i.e., $1 is a pseu- 
dovector that is odd under time reversal). In a zinc- 
blende crystal, $1 has I^s' symmetry and couples the Tq 
conduction band to the Tg valence band in the presence 
of spin-orbit coupling. 

After the transformation x = e xe , the effective 
coordinate for A becomes 

x n „> (k) = «Vk<W + A£„„< + i H„ n / x k 

+ iMSL'+Snn')h, (B5) 

in which A£ nn < = +iAir nn > /u nn i and £l nn > = Sl nn > + 
Afl nn > , where 

A 

An nn , = i A £«* x &Zln> • (B6) 

I 

The transformed velocity v = e~ s ve s — —i[x,H] is 
given by Eq. (fT3]) . In this case, attempting to write v 



in a form analogous to Eq. (|B4[) yields a rather lengthy 
expression that is not given here. 



APPENDIX C: EQUATION flM]) 

This appendix contains a derivation of the expression 
for P 2 = mm fi P 2 (n) given in Eq. {32]) of Sec. lTvbl Here 
P 2 (n) is the smallest value of P 2 where any eigenvalue 
dl(n) = 0. To find the value of P 2 it is therefore nec- 
essary to determine the direction n in which d%(h) first 
reaches zero (for any v or n) as P 2 is increased from zero. 

The problem can be simplified by noting that in the 
PBA, the 6x6 VB block D v (n) can be further reduced 
to the direct sum of two 3x3 spin-zero blocks, since the 
mass parameters in the PBA do not depend on spin. The 
eigenvalues of these 3x3 matrices cannot be found ana- 
lytically for general n, but a useful approximate solution 
can be obtained from a rotated basisi {|A'), \Y'), \Z')} 
in which \Z') — n x \X) + n y \Y) + h z \Z). In this basis, the 
\Z') state is of principal interest because \X') and \Y') 
are not coupled to the CB by the k • p interaction. The 
corresponding diagonal matrix element of D v (n) is 

Dl, z ,{n) = L-2(L-M-N)(nlhl+h 2 z hl+n 2 x h 2 y ). (CI) 

For n in the (100), (110), and (111) directions, the matrix 
D v (h) is diagonal and Eq. (|C1|) is an exact eigenvalue. 
For other directions, Eq. (|C1|) is not an exact eigenvalue, 
but it does provide a useful qualitative description of the 
angular dependence of the exact solution. 

In typical semiconductors, the Luttinger parameters 
(original 66 or modified 48 ) satisfy 73 > 72 t 58 i 60 hence 
L-M-N = 3(73 - 72) > 0. Equation (JUT]) therefore 
suggests that the first eigenvalue dl(n) (for any direc- 
tion n) to reach <i^(n) = as P increases from zero will 
be the eigenvalue L corresponding to a state \Z') with 
n || (100). This tentative conclusion has been confirmed 
by a numerical examination of the eigenvalues <^(n) in 
different directions as P is varied. 

Therefore, when 73 > 72 (as is usually the case), the 
constant P 2 is given by Eq. ([39]) . 
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